A Survey on Multi-relational Classification of Imbalanced Databases
Authors
Hemlata Pant (Research Scholar, School of Engineering, BBD University, Lucknow) and Dr. Reena Srivastava (Dean, School of Computer Applications, BBD University, Lucknow)
Abstract
Multirelational classification algorithms are designed to search for patterns across multiple interlinked relations in a relational database. To classify well, these methods search for relevant features in both the target relation and the relations linked to it. Most of them assume that the classes in the target relation are equally represented, and so they tend to predict the underrepresented class poorly. Learning from imbalanced data is a relatively new challenge that has attracted growing attention from both academia and industry. The major concern in the imbalanced learning problem is how learning algorithms perform in the presence of underrepresented data and severe class distribution skews. In this paper we discuss various methods and approaches for classifying imbalanced multirelational databases.
Similar resources
An approach to mining the multi-relational imbalanced database
The class imbalance problem is an important issue in data mining classification. For example, in applications such as fraudulent telephone calls, telecommunications management, and rare diagnoses, users are more interested in the minority class than in the majority. Although there are many proposed algorithms to solve the imbalance problem, they are unsuitable to be directly applied on a multire...
Learning from Skewed Class Multi-relational Databases
Relational databases, with vast amounts of data (from financial transactions, marketing surveys, and medical records to health informatics observations) and complex schemas, are ubiquitous in our society. Multirelational classification algorithms have been proposed to learn from such relational repositories, where multiple interconnected tables (relations) are involved. These methods search for rel...
Proposing a Novel Cost Sensitive Imbalanced Classification Method based on Hybrid of New Fuzzy Cost Assigning Approaches, Fuzzy Clustering and Evolutionary Algorithms
In this paper, a new hybrid methodology is introduced to design a cost-sensitive fuzzy rule-based classification system. A novel cost metric is proposed based on the combination of three different concepts: Entropy, Gini index and DKM criterion. In order to calculate the effective cost of patterns, a hybrid of fuzzy c-means clustering and particle swarm optimization algorithm is utilized. This ...
On Mining Fuzzy Classification Rules for Imbalanced Data
The fuzzy rule-based classification system (FRBCS) is a popular machine learning technique for classification. One of the major issues when applying it to imbalanced data sets is its bias toward the majority class, so that it performs poorly on the minority class. However, in many cases the minority classes are more important than the majority ones. In this paper, we have extended ...